Duplicate Image Detection in Large Scale Databases

Authors

  • Pratim Ghosh
  • E. Drelie Gelasca
  • K. R. Ramakrishnan
  • B. S. Manjunath
Abstract

We propose an image duplicate detection method for identifying modified copies of the same image in a very large database. The modifications we consider include rotation, scaling, and cropping. We introduce a compact 12-dimensional descriptor based on the Fourier-Mellin transform. The compactness of this descriptor allows efficient indexing over the entire database. Results on a database of 10 million images demonstrate the effectiveness and efficiency of this descriptor. We also propose extensions to arbitrary shape representations and similar scene detection, and include preliminary results.
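To make the idea of a Fourier-Mellin-style descriptor concrete, here is a minimal Python sketch. It is not the authors' 12-dimensional descriptor, whose exact construction is not given in the abstract. The function name `fmt_descriptor`, the grid sizes, and the choice of a 3 x 4 block of low-frequency coefficients are all assumptions made for illustration. The sketch relies on three standard facts: the magnitude of the Fourier transform ignores translation, log-polar resampling turns rotation and scaling into shifts, and a second Fourier magnitude ignores those shifts.

```python
import numpy as np


def fmt_descriptor(image, n_radii=64, n_angles=64):
    """Illustrative 12-value Fourier-Mellin-style descriptor (hypothetical helper)."""
    img = np.asarray(image, dtype=float)

    # Step 1: the FFT magnitude is unchanged by (circular) translation.
    spectrum = np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(img))))

    # Step 2: sample the spectrum on a log-polar grid around its center.
    # Rotation becomes a shift along the angle axis, scaling a shift
    # along the log-radius axis.
    h, w = spectrum.shape
    cy, cx = h // 2, w // 2
    max_r = min(cy, cx) - 1
    if max_r < 1:
        raise ValueError("image is too small for log-polar sampling")
    radii = np.exp(np.linspace(0.0, np.log(max_r), n_radii))
    # Half a turn is enough: the spectrum of a real image is point-symmetric.
    angles = np.linspace(0.0, np.pi, n_angles, endpoint=False)
    ys = np.rint(cy + radii[:, None] * np.sin(angles)[None, :]).astype(int)
    xs = np.rint(cx + radii[:, None] * np.cos(angles)[None, :]).astype(int)
    log_polar = spectrum[ys, xs]

    # Step 3: a second FFT magnitude discards those shifts.
    fm = np.abs(np.fft.fft2(log_polar))

    # Step 4: keep a 3 x 4 block of low-frequency coefficients (12 values,
    # an assumed choice) and normalize to unit length.
    coeffs = fm[:3, :4].ravel()
    norm = np.linalg.norm(coeffs)
    return coeffs / norm if norm > 0 else coeffs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    original = rng.random((32, 32))
    rotated = np.rot90(original)
    distance = np.linalg.norm(fmt_descriptor(original) - fmt_descriptor(rotated))
    print("descriptor distance between original and rotated copy:", distance)
```

Because each image is reduced to 12 numbers, such vectors can be stored in a standard nearest-neighbor index and compared by Euclidean distance. Robustness to cropping and to arbitrary scale factors is only approximate in this sketch.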

Similar resources

Speed-up Multi-modal Near Duplicate Image Detection

Near-duplicate image detection is a necessary operation to refine image search results for efficient user exploration. The existence of large numbers of near duplicates requires fast and accurate automatic near-duplicate detection methods. We have designed a coarse-to-fine near-duplicate detection framework to speed up the process and a multi-modal integration scheme for accurate detection. The...

Near Duplicate Image Detection: min-Hash and tf-idf Weighting

This paper proposes two novel image similarity measures for fast indexing via locality sensitive hashing. The similarity measures are applied and evaluated in the context of near duplicate image detection. The proposed method uses a visual vocabulary of vector quantized local feature descriptors (SIFT) and for retrieval exploits enhanced min-Hash techniques. Standard min-Hash uses an approximat...

An image signature for any kind of image

We describe an algorithm for computing an image signature, suitable for first-stage screening for duplicate images. Our signature relies on relative brightness of image regions, and is generally applicable to photographs, text documents, and line art. We give experimental results on the sensitivity and robustness of signatures for actual image collections, and also results on the robustness of ...

TA-DRD: A Three-step Automatic Duplicate Record Detection

Duplicate record detection is a key step in Deep Web data integration, but the existing approaches do not adapt to its large-scale nature. In this paper, a three-step automatic approach is proposed for duplicate record detection in the Deep Web. It first uses a cluster ensemble to select initial training instances. It then uses tri-training classification to construct a classification model. Final...

Fast Convex Layers Algorithm for Near-Duplicate Image Detection

This paper builds on a novel, fast algorithm for generating the convex layers on grid points with linear time complexity. Convex layers are extracted from the binary image. The obtained convex hulls are characterized by the number of their vertices and used as representative image features. A computational geometric approach to near-duplicate image detection stems from these features. Similarit...

Publication date: 2007